Data README file
S. Goodman 9-29-2015

This document explains how the files are organized to replicate the findings in 
"Learning from the Test: Raising Selective College Enrollment by Providing Information" by Sarena Goodman.

The Stata code and CSV files are annotated to show which variables and lines of code generate each table and figure. 

Figure 1 and Figure 4
	DATA FOR FIGURE 1 AND TO DEFINE ACT STATES: "figure1_actparticipationrates.csv"
	DATA FOR FIGURE 4: COMBINE "implied_returns_thresholds.csv" and "returns_estimates_from_lit.csv"

Figure 2, Table 3, and Table 4
	THESE ARE THE ACT-TAKER RESULTS - THEY RELY ON PROPRIETARY DATA, WHICH ARE AVAILABLE BY APPLICATION (FULL DETAILS BELOW)
	FIGURE 2 CAN BE PRODUCED BY RUNNING "table3_4.txt" AND GRAPHING THE UNDERLYING SCORE DATA BY GEOGRAPHY-PERIOD
	TABLE 3 CAN BE PRODUCED BY FIRST RUNNING "table3_4.txt" with the ACT datafiles as described in the main text AND THEN "firststage.txt" with the cleaned data
	TABLE 4 CAN BE PRODUCED BY RUNNING "table3_4.txt" (some estimates are generated by combining the results in Tables 3 and 4)

Figures 3a and 3b, Tables 6a and 6b, and Table 7
	MAIN REDUCED FORM RESULTS - CLEANING STEPS TO CREATE "mainregsample.dta"
	1. The enrollment data by state of residence are pulled by year for all U.S. higher education institutions and stacked across even years 1994-2010 to form "stacked_ipeds.dta".
	2. Data are reshaped wide by year so that each observation is an institution and state of residence. 
		Enrollment cell-years contain the total number of students at that school from a given state in that year.
	3. Each institution is associated with a Barron's selectivity type (available upon request from Dr. Lesley Turner at the University of Maryland). 
	4. Data are collapsed by state of residence and selectivity. Enrollment cell-years now contain the total number of students from each state attending a selectivity type by year.
	5. Cumulative selective categories are defined from Barron's: 
		"overall" is any category; "competitive" is any category above and including "less competitive"; "selective" is any category above and including "competitive".
	6. Data are collapsed and reshaped so that each observation is a state of residence-enrollment year. Enrollments are expressed as natural logs.
	7. Demographic controls are merged from Census and BLS data
	8. Non-ACT states are removed from sample. (See "figure1_actparticipationrates.csv" for list)
		TABLES 6a, 6b, AND 7 CAN BE PRODUCED BY RUNNING "table6a_7.txt" IN STATA WITH "mainregsample.dta"
		FIGURES 3a AND 3b CAN BE PRODUCED BY SUMMING OVER STATE TYPES FOR EACH CATEGORY AND YEAR, AND INDEXING TO 1994 ENROLLMENT
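
For readers who want to check their understanding of steps 4-6 and the Figure 3 indexing before running the Stata files, the following is a minimal Python sketch of the logic. It is not the code used for the paper. The Barron's category labels and their ordering, the treatment of "overall" as every institution-row, the tuple layouts, and all function and variable names are assumptions made for illustration only.

```python
import math
from collections import defaultdict

# Barron's categories from least to most selective.
# ASSUMPTION: labels and ordering are illustrative, not taken from the data files.
BARRONS_ORDER = [
    "noncompetitive",
    "less competitive",
    "competitive",
    "very competitive",
    "highly competitive",
    "most competitive",
]

# Lowest Barron's category included in each cumulative group (step 5).
# None means every row counts (ASSUMPTION about how "overall" is built).
CUMULATIVE_FLOOR = {
    "overall": None,
    "competitive": "less competitive",
    "selective": "competitive",
}


def in_group(category, group):
    """Return True if a Barron's category falls in the cumulative group."""
    floor = CUMULATIVE_FLOOR[group]
    if floor is None:
        return True
    if category not in BARRONS_ORDER:
        return False
    return BARRONS_ORDER.index(category) >= BARRONS_ORDER.index(floor)


def log_enrollment_by_state_year(rows):
    """Steps 4-6: collapse to state-year-group totals and take natural logs.

    rows: iterable of (state, year, barrons_category, enrollment).
    Returns {(state, year, group): ln(total enrollment)}.
    """
    totals = defaultdict(float)
    for state, year, category, enrollment in rows:
        for group in CUMULATIVE_FLOOR:
            if in_group(category, group):
                totals[(state, year, group)] += enrollment
    # Cells with zero enrollment are dropped because ln(0) is undefined.
    return {key: math.log(total) for key, total in totals.items() if total > 0}


def indexed_enrollment(rows, base_year=1994):
    """Figure 3 logic: sum over states within a state type, index to base year.

    rows: iterable of (state_type, group, year, enrollment).
    Every (state_type, group) series must include the base year.
    """
    totals = defaultdict(float)
    for state_type, group, year, enrollment in rows:
        totals[(state_type, group, year)] += enrollment
    totals = dict(totals)  # plain dict, so a missing base year raises KeyError
    return {
        (state_type, group, year): total / totals[(state_type, group, base_year)]
        for (state_type, group, year), total in totals.items()
    }


# Made-up numbers, for illustration only.
example_rows = [
    ("CO", 1994, "most competitive", 100),
    ("CO", 1994, "less competitive", 300),
    ("CO", 1994, "noncompetitive", 600),
    ("CO", 1996, "competitive", 200),
]
logs = log_enrollment_by_state_year(example_rows)

figure_rows = [
    ("type_a", "selective", 1994, 100),
    ("type_a", "selective", 1994, 100),
    ("type_a", "selective", 1996, 250),
]
indexed = indexed_enrollment(figure_rows)
```

Because the groups are cumulative, a single institution-row can contribute to more than one group: in the made-up data above, the "most competitive" row counts toward "overall", "competitive", and "selective" alike.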


The Stata datasets used for the project are proprietary and cannot be freely posted online. 
Researchers interested in the data for replication purposes may submit a completed ACT data request form ("ACT Data Request Form July 2015.docx") 
to ACT, Inc., either by email to Ken Bozer (ken.bozer@ACT.org) or by mail to the address at the bottom of the form. 

